Random-cluster multi-histogram sampling for the g-state Potts model 
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Using the random-cluster representation of the g-state Potts models we consider the pooling 
of data from cluster-update Monte Carlo simulations for different thermal couplings K and 
number of states per spin q. Proper combination of histograms allows for the evaluation of 
thermal averages in a broad range of K and q values, including non-integer values of q. Due to 
restrictions in the sampling process proper normalization of the combined histogram data is non- 
trivial. We discuss the different possibilities and analyze their respective ranges of applicability. 
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I. INTRODUCTION 

During the last decade the question of how to make 
most efficient use of the data sampled during a Monte 
Carlo (MC) simulation has received an increasing amount 
of attention. The idea of reweighting [y of time-series 
data from a single canonical simulation at a given fixed 
value of a coupling parameter (i.e., most commonly tem- 
perature or magnetic field) to nearby regions of the 
coupling-parameter space, allows for the analysis of ther- 
mal averages as continuous functions of external pa- 
rameters and thus a much more precise determination 
of extremal, pseudo-critical points. As an extension of 
this, the combination of data from simulations at differ- 
ent points in the coupling-parameter space, commonly 
known as multi-histogram technique 0, in principle al- 
lows to get accurate estimates for thermal averages over a 
macroscopical region of couplings from a relatively small 
number of simulation (that, however, generally has to be 
increased with the size of the system) . 

The basic problem with collapsing data from different 
simulations is that of finding the correct relative normal- 
ization of the single histograms. Consider the sampled 
energy histogram HxiiE) of, e.g., an Ising model simu- 
lation at the coupling Ki = J Pi, consisting of N energy- 
measurements. The thermal average of an observable 
A{E) at Ki is just given by the time-series average in the 
importance-sampling scheme and thus insensitive to the 
value of the partition function at that point. Combining 
two histograms, however, amounts to the combination of 
the temperature-independent expressions 
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for different simulations i, where the partition function 
Zxi appears as a normalization constant. Thus, for cor- 
rect relative normalization of the histograms to be com- 
bined, one has to know the ratio of partition functions 
Zk^ /Zki or, equivalently, differences in the free energy 
densities Jk^ ~ fKi at the simulated couplings Ki. In 
Ref. [g this problem has been solved by an iterative so- 
lution of self-consistency equations for the free energy 
differences at adjacent simulation couplings Ki. 

Since the combination given in Eq. (|l|) is nothing but 
an estimator for the density of (energy) states ri(_E), 
multi-histograming data analysis amounts to estimating 
the density of states of the variable that is thermody- 
namically conjugate to the considered coupling parame- 
ter. Going to the random-cluster representation of the 
Potts model, i.e., its interpretation as correlated perco- 
lation model P, 0, 0, the relevant density of states is 
given by the number g{b,n) of bond configurations with 
b bonds and n clusters on the lattice. Apart from gaining 
control over two parameters, the thermal coupling K and 
the number of states q, this language suggests the use of 
cluster estimators for thermal averages like correlation 
functions, which are known to yield a variance-reduction 
in certain situations [||. One of us |^, |8J has proposed 
a multi-histogram technique for the q-state Potts model 
and simulations at different temperatures making use of 
the sampling of cluster decompositions of the lattice as 
they occur in the Swendsen-Wang cluster-update algo- 
rithm Ipj. There, the relative normalization of the in- 
dividual histograms at couplings Ki is accomplished by 
making use of the known absolute number of configura- 
tions with b active bonds on the lattice, which is just 
given by the binomial \fj, £ being the total number of 
bonds of the lattice. While this method appears advan- 
tageous at first sight and gives nice results from the cases 
of percolation (q ^ 1) [|lO| and the Ising model {q = 2) 
B, we find that this procedure is not the best choice 
of normalization for simulations of Potts models with q 



larger than 2 or 3 and propose a different approach for 
normahzation to circumvent this problem. 

The outline of the paper is as follows. In Section II 
we restate the multi-histogram approach of Ref. ||] in 
the random-cluster representation ("RC histograming"), 
which was originally formulated for simulations at fixed q 
only, and generalize it to simulations of multiple q values. 
Applying it to the q = 10 Potts model in two dimensions 
we find large deviations from the expected results. As 
an alternative, in Section III we propose an adaptively 
normalized RC multi-histograming ansatz. We discuss 
details of its implementation and present a comparative 
reweighting analysis for the q = 10 case. For energy- 
related observables we also compare histograming in the 
random-cluster language to the sampling of the energy 
density of states in the well-known framework of his- 
tograming in the energy/magnetization language ("EM 
histograming") B. Comparing both methods, in Sec- 
tion IV we track down the observed deviations with the 
first ansatz to be a result of the application of the above 
mentioned normalization condition. This problem can 
thus be resolved by the second ansatz. Finally, Section 
V contains our conclusions. 



II. RC HISTOGRAMS AND ABSOLUTE 
NORMALIZATION 

Consider the Hamiltonian of the g-state Potts model 
in zero magnetic field. 
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on a general graph Q with Af sites and £ bonds. Trans- 
forming to the random-cluster representation H , the par- 
tition function becomes 

where the sum runs over all bond configurations Q' on 
the graph (subgraphs) , and K = /3J denotes the thermal 
coupling. Notice that the formulation (0) in contrast to 
that of Eq. (g) allows for a natural continuation of the 
model to non-integer values of the parameter q. Using 
the subgraph expansion of the g-state Potts in external 
field, one of us H has shown that the g-state Potts model 
can be considered as a bond-correlated percolation model 



(BCPM) with bond occupation probability p — 1^ 
Eq. (y) can be rewritten as: 
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where g{b, n) denotes the number of subgraphs of Q with b 
activated bonds and n clusters resulting therefrom. This 



purely combinatorial quantity corresponds to the density 
of states of the BCPM. 

The Swendsen-Wang cluster-update algorithm gener- 
ates bond configurations drawn from the equilibrium 
canonical distribution of this model. Thus, the proba- 
bility for the occurrence of a subgraph with b bonds and 
n clusters is given by 
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which in turn is the expectation value of the normal- 
ized sampled histogram of bond configurations, i.e., 
Pp,q{b,n) — {Hp^q{b,n)/N), where N denotes the length 
of the time series of measurements. Here, we separated 
the common factor exp(K£) from the partition function: 



Zpjg) = e"^"- Wp,g{g). 



(6) 



An estimator for the density of states g{b, n) is therefore 
given by 
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Since the reduced partition function Wp_q{g) is a priori 
unknown, the correct normalization of this estimator is 
not known at the beginning. Probably the most obvious 
way of fixing the normalization would be to estimate the 
reduced partition function Wp^q{g) directly from Eq. (|^). 
One can do better than that, however, by considering 
the accumulated density g{b), which is obviously just a 
binomial [fzl. 
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Imposing this restriction on the estimate g(6, n) also, one 
arrives at 
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so that the absolute values of g(6, n) are now fixed by £ 
independent normalization conditions, one for each num- 
ber of active bonds b. Thus we have the following esti- 
mate for the density of states S : 



g{b,n) = Cp^q{b)Hp^q{b,n)q' 



(10) 



Now, we want to combine the estimates g*^*^ (5, n) from 
several simulations at different parameters {pi,qi), i.e., 
we want to do multi-histograming in both parameters, p 
and q. Then, we have 

5(5,n)-^a,(6,n)g«(6,n), 5]a,(6,n) = l. (11) 

i i 

Since we want to minimize the variance ct^ [g] of the final 
estimate and the different simulations are statistically in- 
dependent, the correct choice of the weights a^ obviously 
is given by 
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From Eq. (jICl) we have 






-2"a2r^ 



.{b.n)] 



Cl..ib)qr^"Hp^^,^{b,n), (13) 



such that the variance-optimized estimate for g(b, n) be- 
comes 
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In writing this expression we aUow for several approxima- 
tions: first, we treat Cp-_q.{h) as a parameter in Eq. (|l3) 
instead of taking its own variance into account; this is jus- 
tified by the clear suppression of variance of this quantity 
as a sum as compared to the the variance of its summands 
Hp-^q.{b,n). Secondly, we take <7^[H{b,n)] = H{b,n), 
i.e., we treat the individual bins {b,n) as independently 
distributed according to an uncorrelated 1/A^ statistics, 
which will in general not be exactly fulfilled. Since those 
assumptions only affect the variance of the final esti- 
mate, however, and do not introduce a bias, we con- 
sider them justified. Finally, we do not take autocor- 
relations between successive measurements (&, n) into ac- 
count, i.e., we assume here and in the following that mea- 
surements in the sampling process are taken with a fre- 
quency around 1/rint, where ri„t denotes the integrated 
autocorrelation time, resulting in an effectively uncorre- 
lated time series. 

From the partition function Eq. (0) we infer the follow- 
ing cluster-language estimators for the free energy den- 
sity / — F/Af, the per-site internal energy u — U/Af and 
specific heat c^ = Cy/Af, 
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cf. Appendix A. Here, the estimated expectation value of 
an observable 0{b,n) is defined as 

(0(6,n)). ^ Z;^lig)e'''J2f29ib,n) 

6=0 11=1 
x/(l-p)^-^g"0(&,n). (16) 

For the evaluation of magnetic observables one has to 
distinguish percolating clusters, denoted by indices tt, 
from non-percolating, finite clusters, denoted by indices 
(p. Let {c{G')} be the set of clusters of a subgraph G" 
of the lattice and ndG') the number of sites in cluster c 
of G". Then, consider the following microcanonical aver- 
ages: 
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FIG. 1: Results from the EM and RC multi-histogram anal- 
yses of time series from 9 cluster update simulations of the 
two-dimensional q — 2 Potts model on & M — 16^ lattice, (a) 
Free energy density (left scale) and specific heat (right scale) 
as a function of the coupling K — (5 J as compared to the ex- 
act solution of Ref. [hl| . (b) Relative deviation of the results 
for the specific heat from the exact solution for both methods. 
(c) Relative deviation for the free energy. AH results shown 
are re-scaled from the q = 2 Potts model to the Ising model 
formulation to fit the results from Ref. [hl|. 



!.^(G'), (17) i.e., the average number of sites in percolating clusters 
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FIG. 2: Ratio of standard deviations for estimates of (a) the 
internal energy U and (b) the spontaneous magnetization M 
of the g = 2 Potts model on a A/" = 16^ lattice and RC and EM 
histogram analyses as a function of the coupling K — pj . The 
variances are estimated by a "jackknife" time series anal ysi s 
WM . The solid line of (a) shows the exact result of Eq. (|2^ ) 
and Ref . |ll| , the dashed line that for JV —^ oo from Ref. | {13| . 
In (b) we use the definition (M) ioi K < 0.8 and (A8) for 
K > 0.8. 
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FIG. 3: Internal energy of the two-dimensional g = 10 Potts 
model on a A/" = 16 square lattice with periodic boundary 
conditions as given by the random-cluster (RC) and energy 
representation (EM) multi-histogram analyses from simula- 
tions for different thermal couplings K. The transition point 
of the infinite system is given by Kt — ln(l -|- vTO) ~ 1.426 
0- 



to Ml {b,n) for each bond configuration G' with 
{fe(G')''= b,n{G') = n}, where <(G") should 
be taken for non-percolating bond configurations, 
adding Efc-CG')} '^c (G")]^ to Mlj,,^{b,n), and adding 
X){c*(G')>["'c (G")]^ to M3 p_^(6, n) for each such observed 



configuration. Then, if we define 
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we have the foUowing estimates for mi(b,n), ml^^b^n), 
and m1^{b, n), 
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i.e., the mean square number of sites in non-percolating 
clusters for those subgraphs, and 
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i.e., the mean squared sum of the size of non- 
percolating clusters. These microcanonical averages ob- 
viously can be estimated by adding X^fc^fG')} "c (^0 



which, finally, result in the following expressions for the 
(zero-field) magnetization m and the magnetic suscepti- 
bility X, 
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cf. Appendix A. Note, that we simply add up histograms 
from different simulations in Eqs. (EQ) and ISWj without 
using any reweighting factors in p and q. This is cor- 
rect since the conditional probability of the occurrence 



of, say, a given number of sites in percolating clusters in 
a subgraph with b active bonds and n clusters does no 
longer depend on p and q. The order parameter m of the 
Potts model is given by [Q 
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and the corresponding, rescaled susceptibility is x = [{l^ 

As a first comparative test for the method we per- 
formed a Swendsen-Wang cluster MC simulation for the 
Ising model case {q = 2) on a small, simple-cubic Af — 
16^ lattice with periodic boundary conditions. We gath- 
ered histograms from 9 different simulations at the cou- 
plings iC^ = 0.1,0.2, ... ,0.8 and K = Kc = ^\n{l + V2), 
where the couplings are given in the language of the 
Ising model in this case, i.e., are half of the couplings 
of the corresponding q = 2 Potts model. Each run 
sampled 2^^ — 131 072 bond configurations resulting 
in corresponding time series of (6, n) and of the en- 
ergy/magnetization pairs {E, M) for comparison with the 
EM histograming method. Thus, any differences in the 
results must be solely due to the method of data anal- 
ysis, the underlying simulation data being exactly iden- 
tical. For the EM histograms throughout this paper we 
use a multi-histogram analysis according to Ref. [EJ very 
similar to that presented for the RC histograms in Sec. 
III. Figure IT] shows the results in comparison to the ex- 
act expression for F and C^ on square lattices as given 
by Kaufman |15|] and analysed by Ferdinand and Fisher 
[ pT[ . Statistical errors for both analysis schemes were 
evaluated using the "jackknife" error estimation tech- 
nique [|2| . The relative deviations (C^ — C„)/C„ from the 
exact result are noticeably larger for the RC histogram 
analysis, however in agreement with statistical errors in 
both cases, cp. Fig. |^(b). The same holds true for the 
internal energy U given by the different estimates, which 
is not shown in Fig. |l|. Note that for the energy related 
observables from Eq. ( p^ ) only the limiting distribution 
Pp^qib) is needed, which, in general, has a different width 
than the distribution of energies Pp_q{E), thus leading to 
different variances. 

In fact, by inspection of the random-cluster expression 
for the specific heat Eq. ( p^ ) and comparison with its def- 
inition in the energy language as Cy — K^{{E'^) — (E)^) 
we can infer the following relation between the variances 
of energy estimates in the RC and EM schemes: 
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Thus, energy estimates from RC histograms are always 
less precise than those from EM histograms, regardless of 
the temperature. Figure ||(a) shows the ratio of jackknife- 
estimated variances of the two different estimates of in- 
ternal energy, compared to the result from Eq. (M) with 
the exact expressions for U and C^ for the q = 2 case in- 
serted [^. Note, that from Fig. |2|(a) this quantity seems 



to have extremely small finite-size corrections. As a re- 
minder, this shows clearly that cluster estimators are not 
always improved estimators [|6| , but sometimes "deteri- 
orated estimators". Note, however, that this effect will 
decrease with increasing number of states q, at least in 
the transition region, since the singularity in C^ sharpens 
in this limit, whereas the energies U always stay in the 
range < U /N < 2. For the q = 10 model is has been 
observed that at the transition point Pp^q{b) is almost in- 
distinguishable from Pp^q{E), when suitably rescaled p7[ . 
The minimum of the exact curve of Fig. at the critical 
point is somewhat in contrast to the usual notion that 
cluster estimators work best off criticality 0; this result, 
however, applies to the spin-spin correlation function at 
medium and long distances and to magnetic observables 
like the susceptibility, which is the integral of the correla- 
tion function, whereas the internal energy U constitutes 
the extreme short distance limit of this quantity. For the 
magnetic observables m and x the situation is reversed, 
the variance of the RC estimators being strongly reduced 
as compared to the EM estimators, cp. Fig. ||(b). Note 
that in contrast to the EM case, the RC estimators pro- 
vide a single consistent definition of m and x for both, 
the broken and unbroken phases, cf. Appendix A. 

For the free energy, on the other hand, deviations for 
the RC method are by far smaller than those of the EM 
method, cp. Fig. 0(c). Moreover, deviations are not cov- 
ered by statistical errors in the latter case, a fact we will 
comment on later in the next Section. In the EM case F 
is being fixed by making contact with the non-interacting 
limit K — resp. p — 0, where 
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so that —KF{p = 0)/J\f — Inq. This equation corre- 
sponds to the normalization condition Eq. (ph. It is obvi- 
ous that having a normalization condition for each num- 
ber b of active bonds and therefore, implicitly, for each 
(microcanonical) temperature in the RC case allows for 
accurate estimation of the free energy even far away from 
K = 0, whereas for EM histograms the results deterio- 
rate with the distance from the only normalization point 
K = 0. Thus, for sampling free energies the RC multi- 
histogram technique normalized by Eq. (ph seems to be 
a good choice. 

As a slightly less trivial example we performed simu- 
lations for the g = 10 Potts model on the same lattice, 
which exhibits a strongly first-order phase transition. It 
is well known that cluster algorithms are not efficient to 
reduce the "super-critical" (exponentially strong) slowing 
down of the local MC dynamics at first-order transitions. 
For the small lattice under consideration, however, auto- 
correlation times are still quite moderate, so that one gets 
reliable results without having to resort to more sophis- 
ticated methods like multi-canonical simulations fl^, p8J . 
We gathered data from 11 single-histogram simulations 
at couplings K, = 0.8, 0.9, . . . , 1.8 with 2^° = 1 048 576 
measurements each. Figure shows the quite astonish- 
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FIG. 4: Internal energy of the g = 10 Potts model on a 
A/" = 16 square lattice, reweighted from the q = 2 model 
simulations shown in Fig. nl using multiple RC histograms 
according to Eq. (H). The g = 10 results from the EM multi- 
histogram analysis are shown for comparison. 
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FIG. 5: Density of states for the q = 2 Potts model in two di- 
mensions on a A/" = 16 lattice from single-histogram cluster- 
update simulations at coupling K = 0.4, 0.8 and 1.2 using the 
estimator Eq. (E6|). The solid line shows the exact result of 
Ref. M. 



ing results for the internal energy from this simulation 
data using the analyses in the RC and EM languages, 
respectively. Naturally, we do not have exact results to 
compare with in this case; nevertheless, the results from 
the EM analysis are completely in agreement with our 
expectations and also well compatible with results from 
previous simulations ||l^]. So, obviously, the results from 
the RC histogram analysis are strikingly wrong — and 
in a way that is clearly not covered by the present statis- 
tical errors. Obviously, the results for the specific heat, 
which are not shown, look even worse, with a pronounced, 
unphysical double-peak resulting from the deviations in 
internal energy shown in Fig. 0. 

Since as one of its major strengths in the RC approach 
we have the possibility of reweighting in the parameter 
q also, as a first clue to the reason for this conspicuous 
failure we show the outcome of using the q — 2 simula- 
tion data from above for determining the internal energy 
of the g = 10 case, cp. Fig. H. The agreement with the 
direct EM analysis of the q = 10 simulations is remark- 
ably good considering the large distance in q between the 
simulation and analysis points. Comparing Figs, y and 
B it is quite natural to suspect that the application of 
the normalization condition (ra) is not a proper choice for 
simulation data from larger q models. 



III. RC HISTOGRAMS AND ADAPTIVE 
NORMALIZATION 

To understand this normalization problem let us 
shortly go back to the sampling of the energy density 
of states n{E) for the case of the two-dimensional q ~ 2 
(Ising) model. Here, exact results are not only available 
for thermal averages, but for Q{E) itself [Bol. Using the 



K — normalization condition (pSD, a single- histogram 
estimator for the density of states in the energy language 
would be given by 
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This works quite well in the high-temperature phase 
and for small lattices. For lower temperatures, however 
the histogram loses contact with the normalization point 
K = 0, resulting in large deviations from the correct nor- 
malization, cp. Fig. |5|. Clearly, each simulation samples 
only a rather small window of energy space; from the ex- 
ponential in the denominator of Eq. (|2q), however, con- 
figurations near the maximal energy E = —Af receive the 
largest weight in the sum, so that missing those configu- 
rations, which is the case for large K, results in an expo- 
nentially wrong normalization factor (linear in \nQ,(E)). 
In other words, the absolute normalization condition ( |25| ) 
reweights the histogram data to the point K — 0, which 
will have no reliable outcome if the overlap between the 
histograms at the simulation coupling and at ii" = is 
too small or even vanishes. Note also, that the statistical 
error bars given in Fig. pi do not reflect this fundamental 
failure, although it is statistical in nature. This is due to 
the fact that the usual implementation of error estima- 
tion schemes for histograms take the error of histogram 
bins without entries to be zero, whereas according to 1/A'^ 
statistics it should in some sense be considered infinitely 
large. 

Expecting a similar sampling-related normalization 
failure for the RC histograms normalized by the condi- 
tion (0) let us relax this absolute normalization and ap- 
ply an adaptive normalization scheme as in the original 
EM multi-histograming formulation of Ref. B . Consider 
single-histogram estimates of the partition function from 
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FIG. 6: Internal energy and free energy of the g = 10 Potts 
model on a A/" = 16 square lattice as computed by the adap- 
tively normalized RC multi-histograming scheme according to 
Eqs. (Q and @. The resuhs from the EM multi-histogram 
analysis of the same data are shown for comparison. 



simulations at {pt, qi) and define reduced free energies Ti 
as 

^. = -;^ln^P..,.(a) = -f-£. (27) 

From Eq. (|j) estimates of the density g{b,n) from the 
single histograms Hp.,q.{b,n) = H^^\b^n) are given by: 



5«(6,n) = e 



r^ 



H^^^b,n) 



p\{l-Pif-UfN, 



(28) 



Once again combining these estimates in a variance- 
optimized way as above in Section II, treating the Ti as 
parameters with zero variance, we arrive at the following 
expression 



g{b,n) = 



Y.^N,e'^^p\{l-p,Y-^q^ 



Y.J N] e-2^^- pf (1 - p,)2(£-^) qj'^[H('){b, n)]-^ 

(29) 
Now, from this estimate one has the following a posteriori 
relation for computation of the parameters Ti: 



Af 



^^' = EE5(^")p'(i-pO^^Nr 



(30) 
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Eqs. (P9| ) and (|o|) form a pair of equations to be solved 
self-consistently for the determination of the parameters 
J-'i, which can be straightforwardly iterated by plugging 
in the results for !Fi from Eq. ( |29| ) into Eq. ( pO| ) and 
vice versa. One can improve on that by applying more 
sophisticated iteration schemes like, e.g., the Newton- 
Raphson iteration |£1|. We find, however, that the radius 
of convergence of this method is quite small; therefore, 
we adaptively revert to the simple iteration if the proce- 
dure leaves the Newton-Raphson convergence region. It 



is obvious that for the iteration to converge, one needs 
some overlap between the H{b, n) histograms between 
"adjacent" simulations, i.e., at least pairwise overlap. 
Apart from this restriction, however, we find this iter- 
ative scheme to be very well-behaved, converging rapidly 
in every case that fulfils the overlap-condition. 

To get started, we use first-guess values of the Ti from 
thermodynamic integration. Assume that the simulation 
points (i) = {Pi,qi) are ordered such that the histograms 
of (i) and (i + 1) have reasonable overlap; then 
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(31) 

is a good starting point for the described iteration 
scheme. Ti can be chosen arbitrarily, since the given 
pair of equations is obviously invariant under a global 
shift Ti ^r Ti— T\. Thus, we have determined the final 
estimate g{b,n) only up to a global factor. To fix this 
last normalization we propose two different possibilities; 
on the one hand, we can use the free model limit, i.e., 
evaluate ^p=o,g from Eq. ( |30| ) and use Eq. ( p5| ) for any 

q- 



T, 



p=0,q 



\nZp^Q,q{g) ~ KE ^ Aflnq. 



(32) 



On the other hand, also the q — 1 partition function is 
trivial, 
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(33) 



and can serve as normalization point for arbitrary p. In 
practice the best choice depends on the set of simulated 
couplings (pi, qi): for large-g simulations one might want 
to resort to Eq. ([3^), while otherwise Eq. ( |3^ ) should be 
the better choice. 

Now, we can re-consider the internal energy of the 
q = 10 case from above with the new, adaptively nor- 
malized RC multi-histograming scheme. Figure shows 
internal energy and free energy from this analysis as com- 
pared to the EM multi-histogram approach. As far as the 
error estimates are concerned, we apply the jackknife pro- 
cess to the whole iteration run, i.e., the iteration scheme 
for fixing the weights J-'i is done for each jackknife block 
of data separately, taking full account of statistical errors. 
Clearly, now the results from both approaches perfectly 
agree, the deviations of Fig. ^ have vanished. As antici- 
pated in Sec. II, also the cluster estimator for the internal 
energy performs noticeably better than in the q = 2 case, 
such that — at least in the critical region — it is quite 
comparable in precision to the EM estimator, cp. Fig. |^. 

For the free energy it is obvious that with the adap- 
tive normalization scheme of RC histograms we lose the 
especially high precision throughout the whole K region 
obtained by the application of the sum rule (^) in Fig. |l| 
for the Ising model. To amend this, having fixed the rel- 
ative normalization of the single histograms adaptively. 
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FIG. 7: Ratio of standard deviations for estimates of the in- 
ternal energy U of the two-dimensional q = 10 Potts model 
on a Af — 16^ lattice and RC and EM multi-histogram anal- 
yses as a function of the coupling K = l3J. The variances are 
estimated by a "jackknife" time-series analysisj_2| . The RC 
histograms are normalized according to Eqs. (|29|y~and (|3C|). 




FIG. 8: Density plot of the relative differences [g'^''"^ - 
ff ]/5 ° of the density of states as sampled from the g = 10 
Potts model on a A/" = 16^ lattice by the absolutely nor- 
malized RC histograming scheme of Eq. (nj) (ff'^'"'^) and the 
adaptively normalized scheme of Eq. (M) (^"^° ), respectively. 



Dark shading indicates that g^^ "' {b, n) > g^'^'^ ' (6, n) and vice 



iM)/ 



one might consider applying the sum rule (pi) to the final 
result gibju) instead of using the normalizations Eq. ( |3^ ) 
or Eq. (^3)- This, however, gives results looking almost 
identical to those shown above in Fig. a, i.e., the large 
deviations reappear, which clearly reveals the source they 
are resulting from. 



IV. COMPARISON OF THE METHODS 

The effect of this normalization problem should also be 
clearly seen in the final estimates for the density of states 
g{b, n) from the two RC histograming methods. In Fig. g 
we show a density plot of the relative differences of the 
estimated density of states g{h, n) from the absolutely 
normalized histograming scheme of Eq. (H), g'^'^^^Hh, n), 
and of the adaptively normalized scheme of Eqs. ( P9[ ) and 
®, g^"'^'>{b,n), i.e., the quantity 



A(6,n) 



^(abs)(^^^)_^(rel)(^^^) 

g(''<=i) (6, n) 



(34) 



Note that the range of possible value pairs (6, n) is re- 
stricted by two simple bounds in the (6, n) plane. First, 
starting from the point (6 = 0, n = Af) each added bond 
can at most reduce the number of clusters by one, namely 
by connecting two previously unconnected clusters, i.e., 
one has 



>Af-b. 



(35) 



On the other hand, starting from the "opposite" point 
{b — £,n — 1) one has bJ\f/£ bonds per site, so that for 
producing a new cluster one must at least remove M/£ 



bonds, 



n^l<^{£-b), 



or, for the square lattice, 



<Af- 



1. 



(36) 



(37) 



Apart from single points near those bounds, all configu- 
rations within this triangle can actually appear in a Potts 
model simulation with non-vanishing probability. 

Now, from Fig. pi it is obvious, given that the estimate 
5^''°') {b, n) is correct up to an overall factor, that the abso- 
lutely normalized histograming estimate g^^'°^> (6, n) gives 
too large estimates for b values near the centre b ^ J\f 
as compared to the other regions of b (dark shading in 
Fig. y). Then, considering again the deviation in in- 
ternal energy shown in Fig. y, its origin becomes clear: 
the histogram Hpq{b,n) for a simulation somewhat be- 
low the transition point will be centred around the line 
6 = 6i in Fig. |t then, using the density of states estimate 
g(^^^)(^b,n) for evaluating U, the parts of the histogram 
lying to the right of 6 = 6i will have too large weight as 
compared to the values b < bi, thus by Eq. (|5|) resulting 
in a too large estimate for the internal energy U. On 
the other hand, for couplings above the transition point 
the histogram will be centred around b = bm with too 
large weights for b < bm, leading to estimates for U that 
are too low. Directly in the vicinity of the transition 
point, deviations in normalization are symmetric with 
respect to the histogram, which will be centred around 
6 = fell, thus leading to an unbiased estimate for U. This 
is exactly the behavior found in Fig. |^. Finally, contem- 
plating on the reason for the deviations in normalization 
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FIG. 9: Ratio of standard deviations of estimates of the 
free energy from the adaptively normahzed RC histogram- 
ing scheme of Eq. (g9|) with (crRc,norm) and without (anc) 
a final apphcation of the sum rule Eq. (g|) to the density of 
states after determining the weights. The used time series 
data includes both, the q = 2 and the q — 10 simulations 
reported above. 



shown in Fig. p in the first place, it becomes obvious 
that they have the same origin as those shown in Fig. H. 
The exponential factor q~" from the sum rule Eq. (Q) at- 
taches large weight to the configurations with small num- 
ber of clusters n; if, however, histograms miss entries for 
small n, as is the case for histograms in the transition 
region 6 « A/" of Fig. ||, the sum J2n ^p,q{b, n) g~" will 
become too small, resulting in too large normalization 
factors Cp_q{b). 

Thus, for the application of the sum rule (pi) to the final 
result from the adaptively normalized RC histograming 
scheme to work reliably, one always has to include his- 
tograms from small-(j simulations, like percolation (q — > 
1) or the Ising model {q = 2), that produce configura- 
tions with relatively small numbers of clusters n. To 
illustrate this, we combined the data from the q = 2 and 
q = 10 simulations reported above, used the histogram- 
to get results for q = 10 
afterwards, i.e., the nor- 



ing scheme Eqs. ( |29| ) and (pO) 
and applied the sum rule Eq. (S 
malization of the single histograms was found adaptively, 
whereas the total histogram was normalized by the sum 
rule (H); this yields results for the internal and free en- 
ergies indistinguishable from those of the pure q — 10 
results of Fig. g. For the free energy, however, the size 
of statistical errors is largely affected by the final nor- 
malization, cf. Fig. M. For most of the couplings shown, 
the estimate from the finally sum-rule-normalized den- 
sity of states is up to about 10 times more accurate in 
terms of the statistical errors. The presence and size of 
such a gain for a given coupling is not mainly physically 
motivated, but rather depends on the relation of the sim- 
ulation points [qi,pi) to the points of data analysis. 



V. CONCLUSIONS 

We have considered multi-histogram data analyses of 
time series from cluster-update Monte Carlo simulations 
of the g-state Potts model in the random-cluster lan- 
guage. Generalizing the original formulation of Ref. ||] 
to the case of simulations of different number of states 
q, we found the original ansatz of absolutely normaliz- 
ing the individual histograms with a geometrical sum 
rule for finite-length time series to produce large devia- 
tions from the expected behavior when applied to cases q 
larger then about 3 or 4 in two dimensions. We track this 
error down to a mismatch between exponential suppres- 
sion of a part of the state space (6, n) and a simultaneous 
exponential enhancement of this region in the sum rule 
Eq. (0) . To circumvent this problem, we propose a differ- 
ent ansatz, normalizing the histograms adaptively via a 
set of self-consistency equations aiming at the minimiza- 
tion of the variance of the final estimate of the density 
of states g(h, n). Absolute normalization over the whole 
temperature region can still be maintained by making 
contact with the trivial partition function of the perco- 
lation limit g — > 1 or by combining large- and small-q 
data and applying the sum rule (g) after the adaptive 
normalization. This new approach does not exhibit the 
limitations of the absolutely normalized ansatz to nvaall-q 
simulations. 

Comparing the newly introduced, adaptively normal- 
ized random-cluster "RC" multi-histogram technique 
with multi-histograming in the energy/magnetization 
"EM" language, we can make the following state- 
ments: (a) The cluster variables {b, n) form the natu- 
ral state space for the analysis of the Potts model. Us- 
ing the Swendsen-Wang cluster-update algorithm, these 
numbers are automatically known as a by-product of 
the update-steps; no additional measurement steps are 
needed, (b) The RC representation allows for reweight- 
ing in both parameters, the thermal coupling p resp. K , 
and the number of states q, without systematical errors 
as in the partial transformation of Ref. |23| . Especially, 
the model can be considered for the case of non-integer 
q. It is easy to combine data from different-g simulations 
to enhance the accuracy for large q. (c) Cluster estima- 
tors occur naturally in the RC language. Although we 
found that short-distance observables like the internal 
energy and specific heat are sampled systematically less 
acurate by cluster estimators, this situation is reversed 
for observables sensitive to long-range order like the mag- 
netization, susceptibility and correlation functions. Also, 
even short-range cluster estimators perform comparable 
to EM language estimators for larger q values, at least in 
the transition region. In the RC language, the magnetic 
observables can be defined consistently throughout the 
broken and unbroken phases, cf. Appendix A. 

Apart from that, the combination of data from small 
and large q models can serve as a new method to cope 
with the super-critical slowing down at the first-order 
transitions for large q: for sufficiently large lattices sim- 
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ulation runs will entirely stay in one of the pure phases 
depending on the initial configuration and boundary con- 
ditions due to ergodicity breaking at the transition point. 
However, combining such data with smaller-g simulations 
from the second-order or weak first-order regime allows 
the adaptive normalization scheme to still find the cor- 
rect normalization of the pure-phase histograms without 
real tunneling events. This approach is similar in spirit 
to the simulated tempering technique 1 23 , ^ . 



One might also think of applying the multi-canonical 
p8[ resp. multi-bondic U% simulation approach or one 
of the related techniques to the sampling of the den- 
sity of states g{b,n). Especially, apphcation of the ab- 
solute normalization Eq. (g) to this case might be of in- 
terest. This approach is currently under investigation. 
However, sampling the complete range of possible val- 
ues in the (5, n) plane with sufhcient accuracy is found 
to be a computationally very demanding problem. In 
contrast, in the current approach, we still stick to the 
physically sensible approach of importance-sampling, i.e., 
sampling the phase-space according to the local canonical 
weights. Furthermore, we are able to take full advantage 
of the computational gain of cluster algorithms, whereas 
the generalized-ensemble algorithms put forward so far 
employ local updates (apart from the multi-bondic algo- 
rithm of Ref. 13). 

As an interesting application of our ansatz we suggest 
the analysis of the tri-critical point Qc, where the order 
of the thermal transitions changes from second to first 
order, in three dimensions. There has been quite some 
debate about the location of this point, estimates rang- 
ing from Qc = 2.15 [|2^ to Qc = 2.6 |2^. Furthermore, the 
universality class, critical exponents etc. of this transi- 
tion have not yet been properly analyzed. A test for the 
Qc — 4, case in two dimensions shows that our method 
is well suited for such an analysis. This problem will be 
considered in a forthcoming publication. 
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APPENDIX A: CLUSTER ESTIMATORS 

Consider the Potts model coupled to an external mag- 
netic field H with Hamiltonian [27] 

W = -J^<5(fT,,aj)-i/^(5(a„l). (Al) 

Then, the random-cluster representation of the partition 
function on a graph Q consisting of M sites and £ edges 
(bonds) is given by 



G'cg 



,KS ^/(e')(l_p) 



£-b(g') 



G'cg 



n[(g-i) + e^"1, 



(A2) 



where the product runs over the set of clusters {c} of 
the subgraph Q' , Uc is the number of sites in cluster c, 
K = /3J is the thermal coupling parameter and B = (3H 
denotes the reduced magnetic field, p ~ 1 — e^^ is the 
probability for the activation of bonds. 

The zero-field, per-site internal energy u is then given 
by 
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lnZp,,(g,S = 0) 



N 



9Kl 



(e 



K 




-K^)' <-' 



which shows the close connection between the b and E 
distributions. 



The collaboration of the authors has been supported 
by DAAD-NSC grant No. D/9827248. MW acknowl- 

I 

The zero-field specific heat Cy follows from 
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In the thermodynamic limit, the per-site, zero-field "magnetization" m = (^^ 6{(7i, \)/M) is given by 
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Here, we split the cluster contributions of the sub- 
graph into percolating clusters c^ and non-percolating, 
finite clusters d^ . In the indicated order of taking the 
limits, first A/" ^ 00 and then i? -^ 0, the factors 
exp(— Bric) take the values and 1 for percolating and 
non-percolating clusters c, respectively. Thus, we arrive 
at 



m 




(A6) 



which explicitly reflects the symmetry-breaking nature of 
the percolating configurations. For the order parameter 
771, which varies between for the completely disordered 
state and 1 for the ground states, we find 



qih — 1 
9-1 




(A7) 



For finite lattices one can retain this definition since the 
notion of percolating and non-percolating clusters is still 
well-defined. Note, that this gives a consistent defini- 
tion of the order parameter throughout the disordered 
and broken phases. In contrast, in the EM language one 
has to explicitly break symmetry in the low-temperature 
phase, which is usually done by defining 

mK>Kt = ( max V'(5(cri,j) ) , (A8) 

\i<j<?Y / 

whereas for the unbroken phase one uses 



niK 



<Kt = (^Siai,!) 



(A9) 



Obviously, for finite lattices, the expectation values of the 
RC and EM definitions will not coincide exactly; critical 
exponents, however, will of course agree. 
The zero-field susceptibility x is given by 
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(AlO) 



From Eq. (AlO) one recognizes the widely used improved 
cluster estimator for the high-temperature phase, namely 
the last term. Note, however, that the original improved 
estimator includes all clusters here instead of only the 
non-percolating ones, which makes a difference for finite 
lattices. For finite lattices, once again, from Eq. ( [A10| ) 
we have a single definition for both, the unbroken and 
broken phases. 

Alternatively defining the susceptibility corresponding 



to the order parameter m we get 

2 
/ q \ 
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